CVLNet: Cross-view Semantic Correspondence Learning for Video-Based Camera Localization
Authors
Abstract
This paper tackles the problem of Cross-view Video-based camera Localization (CVL). The task is to localize a query camera by leveraging information from its past observations, i.e., a continuous sequence of images observed at previous time stamps, and matching them to a large overhead-view satellite image. The critical challenge of this task is to learn a powerful global feature descriptor for the sequential ground-view images while considering its domain alignment with reference satellite images. For this purpose, we introduce CVLNet, which first projects the sequential ground-view images into an overhead view by exploring the ground-and-overhead geometric correspondences and then leverages the photo consistency among the projected images to form a global representation. In this way, the cross-view domain differences are bridged. Since the reference satellite images are usually pre-cropped and regularly sampled, there is always a misalignment between the query camera location and its matching satellite image center. Motivated by this, we propose estimating the query camera’s relative displacement to a satellite image before similarity matching. In this displacement estimation process, we also consider the uncertainty of the camera location. For example, a camera is unlikely to be on top of trees. To evaluate the performance of the proposed method, we collect satellite images from Google Map for the KITTI dataset and construct a new cross-view video-based localization benchmark dataset, KITTI-CVL. Extensive experiments have demonstrated the effectiveness of video-based localization over single image-based localization and the superiority of each proposed module over other alternatives.
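The ground-to-overhead projection mentioned in the abstract can be illustrated with a classical flat-ground inverse perspective mapping. This is only a minimal sketch under strong assumptions (pinhole camera, optical axis parallel to a perfectly flat ground plane, nearest-neighbour sampling); it is not the CVLNet implementation, which the abstract does not specify in detail. The function name `ground_to_overhead` and all of its parameters are hypothetical and chosen for illustration.

```python
import numpy as np

def ground_to_overhead(image, fx, fy, cx, cy, cam_height, extent_m, out_size):
    """Resample a ground-view image onto a flat ground plane (assumed geometry).

    Camera frame: x right, y down, z forward. The ground is the plane
    y = cam_height. Returns the overhead image and a validity mask.
    """
    H, W = image.shape[:2]
    half = extent_m / 2.0
    xs = np.linspace(-half, half, out_size)   # lateral offset X in metres
    zs = np.linspace(half, -half, out_size)   # forward offset Z; row 0 is farthest ahead
    X, Z = np.meshgrid(xs, zs)

    valid = Z > 1e-6                          # keep only points in front of the camera
    Z_safe = np.where(valid, Z, 1.0)          # avoid division by zero behind the camera

    # Pinhole projection of the ground point (X, cam_height, Z).
    u = fx * X / Z_safe + cx
    v = fy * cam_height / Z_safe + cy
    ui = np.rint(u).astype(int)
    vi = np.rint(v).astype(int)
    valid &= (ui >= 0) & (ui < W) & (vi >= 0) & (vi < H)

    out = np.zeros((out_size, out_size) + image.shape[2:], dtype=image.dtype)
    out[valid] = image[vi[valid], ui[valid]]  # nearest-neighbour lookup
    return out, valid

# Usage with a synthetic image and assumed intrinsics.
img = np.arange(100 * 200).reshape(100, 200)
overhead, mask = ground_to_overhead(img, 100.0, 100.0, 100.0, 50.0, 2.0, 20.0, 21)
```

Overhead cells that fall behind the camera or outside the image stay zero and are flagged in the mask, which hints at why a single frame covers little of the overhead view and why aggregating a sequence of frames can help.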
Similar resources
In-House Handmade Camera for Indocyanine Green Video Angiography
Background and Aim: Vascular imaging during surgical procedures is very important and has many applications. There are several methods for intraoperative vascular assessment, such as intraoperative angiography, Doppler, and fluorescence-based techniques. The latter group, especially Indocyanine Green (ICG) Video Angiography (VA), is commonly used for vascular surgery and sentinel node...
Video Processing on Vehicle Front-View Camera
In this project, image processing techniques are used to develop a basic understanding of video recorded by a vehicle’s front-view camera. By post-processing the recording, the computer is able to extract road sign and traffic light information and to predict how fast the vehicle is moving and steering. The road sign recognition is able to identify most of the signs in the ...
Cross-lingual Parse Disambiguation based on Semantic Correspondence
We present a system for cross-lingual parse disambiguation, exploiting the assumption that the meaning of a sentence remains unchanged during translation and the fact that different languages have different ambiguities. We simultaneously reduce ambiguity in multiple languages in a fully automatic way. Evaluation shows that the system reliably discards dispreferred parses from the raw parser out...
An active camera system for acquiring multi-view video
A system is described for acquiring multi-view video of a person moving through the environment. A real-time tracking algorithm adjusts the pan, tilt, zoom and focus parameters of multiple active cameras to keep the moving person centered in each view. The output of the system is a set of synchronized, time-stamped video streams, showing the person simultaneously from several viewpoints.
Implementation of an acoustic localization algorithm for video camera steering
Advances in wireless networking and sensor network design open new prospects for performing demanding collaborative sensing and signal processing. This project was motivated by the ambitious plan of designing and implementing acoustic applications in low-cost distributed wireless sensor networks. As a first step, we considered the problem of acoustic localization through a set of distributed sen...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2023
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-031-26319-4_8